1. INTRODUCTION

Various critical values (synonymous with percent points or
evaluations of the inverse distribution function) of the classical
Student's t distribution are frequently useful in applied
statistics. Selected values are, of course, widely tabulated;
see Fisher and Yates (1963) and Pearson and Hartley (1976);
the latter are also reproduced, with extensions by E.T. Federighi,
in Abramowitz and Stegun (1968). In certain circumstances, however,
it is convenient to be able to compute "t" percent points
directly, accurately, and simply, without the need for extensive
tables except, perhaps, a normal (Gaussian) table; but see Section 5. A
simply derived, or retrievable, computational procedure for doing so is
presented in this paper. It can be carried out quickly on a hand-held
calculator and has been programmed, for instance, for the
TI-59, the TRS-80 and the HP-41C. It seems that the accuracy of
the numerical values obtained, especially at usually required
levels (e.g., 95%) but also at much more extreme ones, coupled
with the ease of their computation, should provide a tempting
argument for their wide use.

Several similar approximations have appeared in various
journals over the last two decades. Among the most successful of
these is that derived by Peizer and Pratt (1968), hereafter
abbreviated PP:

$$\hat{t}_n(\alpha)_{PP} = \left\{ n \exp\left[ z^2(\alpha)\,\frac{n - \frac{5}{6}}{\left(n - \frac{2}{3} + \frac{1}{10n}\right)^2} \right] - n \right\}^{1/2}, \tag{1.1}$$

where $\alpha$ is the right single-tail probability, so $0 < \alpha \le 0.5$,
and $z(\alpha)$ is the corresponding normal (Gaussian) percent point.
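As a concrete reading of (1.1), the following minimal sketch evaluates the PP formula. It assumes the constants 5/6, 2/3 and 1/(10n) as they are usually quoted for the Peizer-Pratt approximation, and it takes $z(\alpha)$ from Python's `statistics.NormalDist` rather than from a table; the function name `t_peizer_pratt` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def t_peizer_pratt(alpha: float, n: float) -> float:
    """Approximate upper-tail Student's t percent point via (1.1)."""
    if not 0.0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 0.5]")
    # z(alpha): normal percent point with right-tail area alpha.
    z = NormalDist().inv_cdf(1.0 - alpha)
    # Assumed PP constants: (n - 5/6) / (n - 2/3 + 1/(10 n))^2.
    ratio = (n - 5.0 / 6.0) / (n - 2.0 / 3.0 + 1.0 / (10.0 * n)) ** 2
    return sqrt(n * exp(z * z * ratio) - n)


print(round(t_peizer_pratt(0.05, 10), 3))
```

The printed value can be compared with the PP entry for n = 10 and tail area .05 in Table 2.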

Approximations based on asymptotic expansions appeared earlier
(Wallace 1958, 1959) and were successful for moderate degrees
of freedom and not-too-extreme tail areas. Other approaches
have involved rational functions in the degrees of freedom
(Gardiner and Bombay 1965, Kramer 1966) or the logistic distribution
(Mudholkar and Chaubey 1975). A formula due to Koehler (1983) is
based on a novel data-analytic approach to the t-tables, pioneered
by Hoaglin; let $\hat{t}_n(\alpha)_K$ represent Koehler's values. Further
accurate approximations are reviewed by Bailey (1980).

Often, suggested approximations are either simple but not
terribly accurate, or else extremely complicated, involving
many coefficients. The present approach offers both simplicity
and a high degree of accuracy, yielding two-digit accuracy or
better for moderate degrees of freedom, across a broad range of
tail areas. We call it a retrievable recipe because the simple
basic idea allows it to be rederived quickly when needed.

2. DERIVATION

Examination of an extensive table of Student's t, or some
mathematical analysis, shows that for $\alpha < 0.5$ there is a
monotonically increasing transformation that stretches a Normal
quantile $z(\alpha)$ into a Student's t quantile, $t_n(\alpha)$. Let
$t_n(\alpha) = \psi_n^*(z(\alpha))$, with $\psi_n^*(\cdot)$ representing the transform.
We search for a simple approximation to $\psi_n^*(\cdot)$; call it $\psi_n(\cdot)$.
By definition,

$$\int_{-\infty}^{z(\alpha)} \frac{e^{-u^2/2}}{\sqrt{2\pi}}\,du = \int_{-\infty}^{t_n(\alpha)} C(n)\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt = 1 - \alpha, \tag{2.1}$$

where $C(n)$ is the normalizing constant. Equivalently,

$$\int_{-\infty}^{z} \frac{e^{-u^2/2}}{\sqrt{2\pi}}\,du = \int_{-\infty}^{\psi_n^*(z)} C(n)\left(1 + \frac{t^2}{n}\right)^{-(n+1)/2} dt. \tag{2.2}$$

Differentiation of both sides with respect to $z$ now leads to

$$\frac{e^{-z^2/2}}{\sqrt{2\pi}} = C(n)\left(1 + \frac{\psi_n^{*2}(z)}{n}\right)^{-(n+1)/2} \frac{d\psi_n^*(z)}{dz}. \tag{2.3}$$



Our approximation has its origin in the fact that $t_n(\alpha)$ approaches
$z(\alpha)$ and $d\psi_n^*(z)/dz \to 1$ as $n$ becomes large for fixed $\alpha$.
Consequently, simply allow the approximation $\psi_n(z)$ to satisfy

$$\frac{e^{-z^2/2}}{\sqrt{2\pi}} = C(n)\left(1 + \frac{\psi_n^2(z)}{n}\right)^{-(n+1)/2} \tag{2.4}$$

for every $n$. Solving (2.4) for $\psi_n^2(z)$ leads to an expression of
the following general form:

$$t_n^2(\alpha) \approx \psi_n^2(z(\alpha)) = n\left\{K(n)\,e^{H(n) z^2(\alpha)/2} - 1\right\}. \tag{2.5}$$
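The algebra behind this general form can be written out as a short worked step. The explicit constants below are simply what (2.4) gives when solved literally; they are shown only to motivate the shape of (2.5), in which $K(n)$ and $H(n)$ are treated as free and fixed by the conditions that follow.

```latex
\begin{aligned}
\left(1 + \frac{\psi_n^2(z)}{n}\right)^{(n+1)/2} &= \sqrt{2\pi}\,C(n)\,e^{z^2/2}, \\
\psi_n^2(z) &= n\left\{\left[\sqrt{2\pi}\,C(n)\right]^{2/(n+1)} e^{z^2/(n+1)} - 1\right\}.
\end{aligned}
```

This has the form $n\{K e^{H z^2/2} - 1\}$ with $K = [\sqrt{2\pi}\,C(n)]^{2/(n+1)}$ and $H = 2/(n+1)$.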
But for $\alpha = 0.5$, $z(\alpha) = t_n(\alpha) = 0$, so $K(n) = 1$ for all $n$. In
order to determine $H(n)$, consider matching expectations of random
variables. On the left-hand side of (2.5), $E(t_n^2) = \mathrm{Var}[t_n] = n/(n-2)$
for $n > 2$; the right-hand side requires the evaluation

$$E\left[\exp\{H(n) Z^2/2\}\right] = \int_{-\infty}^{\infty} \exp\{H(n) z^2/2\}\,\frac{\exp\{-z^2/2\}}{\sqrt{2\pi}}\,dz = [1 - H(n)]^{-1/2},$$

where $Z$ is a unit normal random variable. Notice also that this
evaluation may be recovered easily from the moment generating
function of the $\chi_1^2$ distribution. Thus for second moment
matching,

$$\frac{n}{n-2} = n\left[(1 - H(n))^{-1/2} - 1\right],$$

and so

$$H(n) = (2n-3)/(n-1)^2. \tag{2.6}$$
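The step from the moment-matching equation to (2.6) is short; written out as a worked check:

```latex
\begin{aligned}
(1 - H(n))^{-1/2} &= 1 + \frac{1}{n-2} = \frac{n-1}{n-2}, \\
1 - H(n) &= \frac{(n-2)^2}{(n-1)^2}, \\
H(n) &= \frac{(n-1)^2 - (n-2)^2}{(n-1)^2} = \frac{2n-3}{(n-1)^2}.
\end{aligned}
```

Hence $H(n)/2 = (n - 3/2)/(n-1)^2$, which is the coefficient of $z^2(\alpha)$ in the exponent of (2.7).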

Our suggested first approximation is, then,

$$\hat{t}_n(\alpha)_{GK(I)} = \left[ n \exp\left\{ z^2(\alpha)\,(n - 3/2)/(n-1)^2 \right\} - n \right]^{1/2} \tag{2.7}$$

for $\alpha < 0.50$. Notice that this expression strongly resembles the
Peizer-Pratt approximation, but has a somewhat different exponent.
Numerical examples, displayed later, also suggest that it is of
acceptable accuracy, usually being somewhat superior to that of
Peizer and Pratt. A distinctive feature of the above approximation,
termed GK(I) for short, is its intuitively appealing and
easily recollected derivation: it is retrievable. Note that
this expression is convenient for simulating t-values, as in Ury
(1980). Iteration of the expression (i.e., replacing $z$ by $t$
on the right-hand side of (2.7)) yields samples from even longer-tailed
distributions; such samples may be useful in robustness studies.
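A minimal sketch of (2.7) follows. It assumes that $z(\alpha)$ is taken from Python's `statistics.NormalDist` instead of a printed normal table, and it restricts $n$ to values above 2 because the variance matching used in the derivation needs $n > 2$; the name `t_gk1` is hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def t_gk1(alpha: float, n: float) -> float:
    """GK(I) approximation (2.7) to the upper-tail t percent point."""
    if not 0.0 < alpha <= 0.5:
        raise ValueError("alpha must lie in (0, 0.5]")
    if n <= 2:
        raise ValueError("n must exceed 2 (variance matching needs n > 2)")
    z = NormalDist().inv_cdf(1.0 - alpha)  # z(alpha), right-tail area alpha
    h_half = (n - 1.5) / (n - 1.0) ** 2    # H(n)/2 from (2.6)
    return sqrt(n * exp(z * z * h_half) - n)


for n in (4, 10, 30):
    print(n, round(t_gk1(0.05, n), 3), round(t_gk1(0.001, n), 3))
```

The printed values can be set against the GK(I) rows of Table 2; small differences in the last digit are to be expected because the table was computed with an approximate normal percent point.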

3. IMPROVING THE ACCURACY OF THE APPROXIMATION 

Before numerically comparing the accuracy of $\hat{t}_n(\alpha)_{PP}$ with
GK(I), we consider a method for improving the accuracy as follows.
Let us assume that the true value of Student's t can be written
as in (2.7) but with a slightly different tail area; i.e., with
$\alpha^*$ a function of $\alpha$:

$$t_n(\alpha) = \left\{ n \exp\left[ z(\alpha^*)^2\,(n - \tfrac{3}{2})/(n-1)^2 \right] - n \right\}^{1/2}. \tag{3.1}$$

Upon rewriting (3.1), we see that

$$\alpha^*(n) = 1 - \Phi\left( \left\{ \left[\ln(1 + t_n^2(\alpha)/n)\right]\left[(n-1)^2/(n - \tfrac{3}{2})\right] \right\}^{1/2} \right), \tag{3.2}$$

where $\Phi$ denotes the standard Gaussian cumulative distribution
function. Now Figure 1 shows that $\ln(\alpha^*(n) - \alpha)$ is roughly linear
in $\ln(n)$, for several values of $\alpha$. The least squares estimates
for the slope and intercept for a few values of $\alpha$ are shown in
Table 1. A typical value for the slope is taken to be $-1.86$;
the intercept behaves like $-3 + 0.62(\ln \alpha)$. Thus

$$(\alpha^* - \alpha) \approx e^{-3}\,\alpha^{0.62}/n^{1.86}$$

or

$$\alpha^* \approx \alpha + 0.04979\,(\alpha/n^3)^{0.62}. \tag{3.3}$$
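To see what (3.2) and (3.3) look like numerically, the sketch below computes the adjusted tail area both ways for a single case. It assumes the "True" entry 4.144 of Table 2 (n = 10, tail area .001) as the exact percent point and uses `statistics.NormalDist` for $\Phi$; the function names are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def alpha_star_exact(t_true: float, n: float) -> float:
    """Tail area alpha* that makes (2.7) reproduce t_true, as in (3.2)."""
    z_star = sqrt(log(1.0 + t_true**2 / n) * (n - 1.0) ** 2 / (n - 1.5))
    return 1.0 - NormalDist().cdf(z_star)  # right-tail area


def alpha_star_fit(alpha: float, n: float) -> float:
    """Empirical adjustment (3.3)."""
    return alpha + 0.04979 * (alpha / n**3) ** 0.62


alpha, n, t_true = 0.001, 10, 4.144  # "True" entry of Table 2
print(alpha_star_exact(t_true, n) - alpha)
print(alpha_star_fit(alpha, n) - alpha)
```

The two differences are of the same order of magnitude but not identical, which fits the description of (3.3) as a rough empirical fit.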



So our improved percent point should be

$$\hat{t}_n(\alpha)_{GK(II)} = \hat{t}_n(\alpha^*)_{GK(I)}, \tag{3.4}$$

that is, (2.7) evaluated at the adjusted tail area $\alpha^*$ of (3.3).
Note that the adjustment to $\alpha$ in (3.3) decreases rapidly as $n$
increases. Of course, the above correction is empirical and
doubtless can be further improved. Unfortunately, it is not easily
retrieved in a manner analogous to the derivation of $\hat{t}_n(\alpha)_{GK(I)}$.
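The two-step recipe of (3.3) and (3.4) can be sketched as follows, again assuming `statistics.NormalDist` supplies the normal percent point; `t_gk1` and `t_gk2` are hypothetical names.

```python
from math import exp, sqrt
from statistics import NormalDist


def t_gk1(alpha: float, n: float) -> float:
    """GK(I) approximation (2.7)."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return sqrt(n * exp(z * z * (n - 1.5) / (n - 1.0) ** 2) - n)


def t_gk2(alpha: float, n: float) -> float:
    """GK(II): evaluate (2.7) at the adjusted tail area alpha* of (3.3)."""
    alpha_star = alpha + 0.04979 * (alpha / n**3) ** 0.62
    return t_gk1(alpha_star, n)


for n in (4, 10, 20):
    print(n, round(t_gk1(0.001, n), 3), round(t_gk2(0.001, n), 3))
```

Because $\alpha^* > \alpha$, GK(II) always lies slightly below GK(I), and the gap shrinks quickly as $n$ grows.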



4. COMPARING THE APPROXIMATIONS

Figure 2 compares the accuracy of the three approximations (1.1),
(2.7), (3.4), and Koehler's formula as a function of
$x = -10 \log_{10}(\text{tail area})$, for $n = 6, 10, 20, 30$, by plotting the
relative error $[= (\hat{t}_n(\alpha) - t_n(\alpha))/t_n(\alpha)]$. Notice that in all
the graphs, the simple approximation given by GK(I) (2.7) is slightly
better than that suggested by Peizer and Pratt. Considerable
improvement is attained using the adjusted value of $\alpha$ given by
GK(II) in (3.4).

A few values of each approximation are tabulated in Table 2
and compared with the true percentage points. Notice that, while
GK(II) is initially worse than GK(I) for low degrees of freedom,
it results in an extra digit of accuracy for moderate $n$ and extremely
small $\alpha$. In fact, GK(II) yields 2-3 decimals of accuracy for
$n \ge 10$ over the entire range of $\alpha$ considered, 0.05 to 0.000001.
Koehler's formula is better for small $n$ ($n = 4$) and moderate $\alpha$
($\alpha > 0.025$), and is about the same as GK(I) and GK(II) when $n$ is
very large ($n = 60$). However, the choice of approximation at $n = 60$
is possibly academic, as many users would be satisfied with Gaussian
percent points for such large degrees of freedom. In brief,
GK(II) obtains an extra digit of accuracy for extreme tail areas
and moderate degrees of freedom. Notice that the correction factor
is essentially 0 for large $n$, so there is no advantage of GK(II)
over GK(I) for $n$ greater than, say, 30.
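The comparison can be reproduced in miniature for n = 10. The sketch below assumes the "True" row of Table 2 as reference values, the usual Peizer-Pratt constants in (1.1), and `statistics.NormalDist` for the normal percent point (the table itself used an approximate one), so the last digits may differ slightly from the tabulated entries. All function names are hypothetical.

```python
from math import exp, log10, sqrt
from statistics import NormalDist

TRUE_T10 = {0.05: 1.812, 0.025: 2.228, 0.01: 2.764,
            0.005: 3.169, 0.001: 4.144, 0.0001: 5.694}  # Table 2, n = 10


def z_upper(alpha):
    """Normal percent point with right-tail area alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


def t_pp(alpha, n):
    r = (n - 5 / 6) / (n - 2 / 3 + 1 / (10 * n)) ** 2
    return sqrt(n * exp(z_upper(alpha) ** 2 * r) - n)


def t_gk1(alpha, n):
    return sqrt(n * exp(z_upper(alpha) ** 2 * (n - 1.5) / (n - 1) ** 2) - n)


def t_gk2(alpha, n):
    return t_gk1(alpha + 0.04979 * (alpha / n**3) ** 0.62, n)


def rel_err(approx, true):
    """Relative error as defined for Figure 2."""
    return (approx - true) / true


rows = []
for alpha, t_true in TRUE_T10.items():
    x = -10 * log10(alpha)
    errs = tuple(rel_err(f(alpha, 10), t_true) for f in (t_pp, t_gk1, t_gk2))
    rows.append((x, *errs))
    print(f"x={x:5.1f}  PP={errs[0]:+.4f}  GK(I)={errs[1]:+.4f}  GK(II)={errs[2]:+.4f}")
```

Since the reference values are rounded to three decimals, relative errors below roughly 0.0003 are not resolved by this check.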

All approximations requiring $z(\alpha)$ used formula (26.2.23) from
AMS 55 (Abramowitz and Stegun 1968) in the table and figures of
comparisons. It may be noted that the approximation GK(I), (2.7),
may be inverted to determine approximate probability values
(so-called "p-values"). A table of the Gaussian distribution, or an
approximation to the Gaussian percent points, is required.
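Solving (2.7) for $z$ and applying the Gaussian upper-tail probability gives the inversion mentioned above. The sketch below is one way to write it, assuming `statistics.NormalDist` in place of a Gaussian table; the name `p_upper_gk1` is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def p_upper_gk1(t: float, n: float) -> float:
    """Approximate P(T_n > t) for t >= 0 by inverting (2.7)."""
    if t < 0:
        raise ValueError("t must be non-negative; use symmetry for t < 0")
    # z such that (2.7) returns t: z^2 = ln(1 + t^2/n) (n-1)^2 / (n - 3/2).
    z = sqrt(log(1.0 + t * t / n) * (n - 1.0) ** 2 / (n - 1.5))
    return 1.0 - NormalDist().cdf(z)


print(p_upper_gk1(1.812, 10))  # Table 2 lists 1.812 as the true 5% point for n = 10
```

For very small tail areas, `1.0 - cdf(z)` loses relative precision; a complementary error function would be a reasonable substitute in that regime.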

5. TOWARDS A SIMPLE STAND-ALONE APPROXIMATION 

It is tempting to calculate our t-value approximations, which 
depend upon tabulated normal values, with the aid of approximate 
normal values that can be computed easily from scratch. The 
result is a stand-alone t-value approximation, accurate to nearly 
two digits over a surprisingly large range. 

Here is a suggested way of proceeding. Tukey's $\lambda$-distribution
(see Tukey 1970, as referred to in McNeil 1977, p. 88) provides

$$z_T(\alpha) = \Phi^{-1}(1-\alpha;\lambda) = \frac{\sqrt{\pi/2}\;2^{\lambda}}{2\lambda}\left[(1-\alpha)^{\lambda} - \alpha^{\lambda}\right]; \tag{5.1}$$

with $\lambda = 0.14$ it yields inverse normal values to 3-digit accuracy
down to $\alpha = 0.01$. In order to extend fairly satisfactorily to
$\alpha = 10^{-6}$, proceed as follows: put $\alpha = 10^{-u}$ and write

$$\int_{z(u)}^{\infty} \frac{\exp\{-z^2/2\}}{\sqrt{2\pi}}\,dz = 10^{-u}, \tag{5.2}$$

so

$$-u \ln 10 = \ln \int_{z(u)}^{\infty} \frac{\exp\{-z^2/2\}}{\sqrt{2\pi}}\,dz. \tag{5.3}$$



Now differentiate, and examine the result as $u$ becomes large
(cf. Feller 1957, p. 193):

$$\ln 10 = \frac{e^{-z^2(u)/2}}{\int_{z(u)}^{\infty} e^{-z^2/2}\,dz}\,\frac{dz(u)}{du} \approx z(u)\,\frac{dz(u)}{du}. \tag{5.4}$$
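The large-$u$ step in (5.4) replaces the ratio of the normal density to its upper-tail area by $z$ itself. A minimal numerical check of how close that is over the range of interest, assuming `statistics.NormalDist` for the tail area (the name `hazard` is hypothetical):

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def hazard(z: float) -> float:
    """phi(z) / (1 - Phi(z)): the exact factor multiplying dz/du in (5.4)."""
    phi = exp(-0.5 * z * z) / sqrt(2.0 * pi)
    return phi / (1.0 - NormalDist().cdf(z))


for z in (2.5, 3.0, 4.0, 4.75):
    print(z, round(hazard(z) / z, 3))
```

The ratio stays a little above 1 on this range, so the exact slope of $z^2$ against $u$ is somewhat smaller than $2 \ln 10 \approx 4.605$; this may be why a slightly reduced slope gives better numerical agreement.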



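Integration gives (for "large" $u$, here $2 < u \le 6$)

$$z^2(u) = z^2(u_0) + 2(\ln 10)(u - u_0). \tag{5.5}$$

Take $u_0 = -\log(0.01) = 2$, $z(u_0) = z_T(0.01) = 2.58$ and replace
$2 \ln 10$ by 4.32 to achieve slightly better results. Then utilize
these numbers to find, for $\alpha < 0.01$,

$$z_T^2(\alpha) = z_T^2(0.01) + 4.32\,(-\log(2\alpha) - 2). \tag{5.6}$$

Here log is the base-10 logarithm. The anchor value 2.58 corresponds
to a two-tail area of 0.01 (single-tail 0.005), and the factor 2 in
(5.6) converts the single-tail $\alpha$ to the same two-tail scale;
expanding (5.6) gives the constant $-3.284$ used in (5.7).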



In summary, use the following prescription for the normal
values:

$$z_T(\alpha) = \begin{cases} 4.932\left[(1-\alpha)^{0.14} - \alpha^{0.14}\right], & 10^{-2} \le \alpha \le 0.5, \\ \sqrt{-4.32 \log \alpha - 3.284}, & 10^{-6} \le \alpha < 10^{-2}, \end{cases} \tag{5.7}$$

with close to 2-digit accuracy throughout the stated range.
Refinement or improvement is possible, but at the apparent
price of a more elaborate representation.
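A stand-alone version of the recipe can be sketched as below. Two assumptions are worth stating: the leading constant is obtained by evaluating the bracketed factor of (5.1) at $\lambda = 0.14$, which gives about 4.93 and is the value that reproduces the GK(III) entries of Table 2; and the logarithm in the small-$\alpha$ branch is taken to base 10. The names `z_tukey` and `t_gk3` are hypothetical.

```python
from math import exp, log10, pi, sqrt

LAM = 0.14
# Constant of (5.1) evaluated at lambda = 0.14 (about 4.93).
TUKEY_COEF = sqrt(pi / 2.0) * 2.0**LAM / (2.0 * LAM)


def z_tukey(alpha: float) -> float:
    """Stand-alone normal percent point z_T(alpha), right-tail area alpha."""
    if not 1e-6 <= alpha <= 0.5:
        raise ValueError("alpha outside the stated range [1e-6, 0.5]")
    if alpha >= 1e-2:
        return TUKEY_COEF * ((1.0 - alpha) ** LAM - alpha**LAM)
    return sqrt(-4.32 * log10(alpha) - 3.284)


def t_gk3(alpha: float, n: float) -> float:
    """(2.7) driven by z_T instead of a tabulated normal value."""
    z = z_tukey(alpha)
    return sqrt(n * exp(z * z * (n - 1.5) / (n - 1.0) ** 2) - n)


for alpha in (0.05, 0.01, 0.001, 0.0001):
    print(alpha, round(z_tukey(alpha), 3), round(t_gk3(alpha, 10), 3))
```

No normal table or library normal routine is needed here, which is the point of the stand-alone variant; the price is roughly one digit of accuracy relative to GK(I).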

Table 2 includes t-values computed using the normal approximation
(5.7). These are labelled $\hat{t}_n(\alpha)_{GK(III)}$.
Table 1

Linear fits of $\ln(\alpha^* - \alpha)$ vs $\ln(n)$

    $\alpha$     Slope      Intercept
    .05          -2.876      -3.447
    .02          -1.308      -8.487
    .01          -1.809      -6.495
    .005         -1.828      -6.473
    .001         -1.928      -6.800
    .0005        -1.943      -7.138
    .00005       -1.930      -8.683
    .00001       -1.927      -9.872
    .000005      -1.822     -10.664
    .000001      -1.771     -11.978
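As a rough consistency check on the constants quoted in Section 3, the intercepts of Table 1 can themselves be regressed on $\ln \alpha$. The sketch below uses an unweighted least-squares line over all ten rows, which is an assumption; the rounded values adopted in the text ($-3$, $0.62$ and a typical slope of $-1.86$) may rest on a different subset or weighting. The helper name `least_squares` is hypothetical.

```python
from math import log

# (alpha, slope, intercept) rows as listed in Table 1.
TABLE_1 = [
    (0.05, -2.876, -3.447), (0.02, -1.308, -8.487), (0.01, -1.809, -6.495),
    (0.005, -1.828, -6.473), (0.001, -1.928, -6.800), (0.0005, -1.943, -7.138),
    (0.00005, -1.930, -8.683), (0.00001, -1.927, -9.872),
    (0.000005, -1.822, -10.664), (0.000001, -1.771, -11.978),
]


def least_squares(xs, ys):
    """Ordinary least squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b


xs = [log(a) for a, _, _ in TABLE_1]
ys = [c for _, _, c in TABLE_1]
a_hat, b_hat = least_squares(xs, ys)
mean_slope = sum(s for _, s, _ in TABLE_1) / len(TABLE_1)
print(round(a_hat, 2), round(b_hat, 2), round(mean_slope, 2))
```

The fitted constant and coefficient land in the neighbourhood of the quoted $-3$ and $0.62$, and the average slope is close to the quoted typical value, which is all such a check can be expected to show.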



Table 2

Comparing approximations

Single tail area    .05      .025     .01      .005     .001     .0001
-10 log(tail area)  (13)     (16)     (20)     (23)     (30)     (40)

n = 4   True        2.132    2.776    3.747    4.604    7.171    11.559
        K           2.139    2.776*   3.708    4.509    6.853    12.365*
        PP          2.134*   2.787    3.780    4.667    7.379    13.798
        GK(I)       2.118    2.763    3.741*   4.613*   7.266    13.510
        GK(II)      2.107    2.748    3.716    4.575    7.165*   13.091
        GK(III)     2.134    2.790    3.773    4.628    7.402    13.828

n = 10  True        1.812    2.228    2.764    3.169    4.144    5.694
        K           1.823    2.242    2.778    3.182    4.147*   5.684
        PP          1.813    2.230    2.767    3.174    4.155    5.721
        GK(I)       1.812*   2.229*   2.766    3.173    4.153    5.718
        GK(II)      1.811    2.227*   2.764*   3.170*   4.147*   5.701*
        GK(III)     1.824    2.245    2.781    3.179    4.196    5.782

n = 20  True        1.725    2.086    2.528    2.845    3.552    4.539
        K           1.728    2.090    2.531    2.847    3.552*   4.553
        PP          1.725*   2.087    2.529    2.847    3.554    4.543
        GK(I)       1.725*   2.087    2.529    2.847    3.554    4.542
        GK(II)      1.725*   2.086*   2.528*   2.846*   3.553    4.540*
        GK(III)     1.735    2.100    2.541    2.851    3.583    4.580

n = 30  True        1.697    2.042    2.457    2.750    3.385    4.234
        K           1.697*   2.042*   2.455    2.746    3.379    4.239
        PP          1.697*   2.043    2.458*   2.751*   3.386*   4.236*
        GK(I)       1.698    2.043    2.458*   2.751*   3.386*   4.236*
        GK(II)      1.697*   2.043    2.458*   2.751*   3.386*   4.236*
        GK(III)     1.707    2.056    2.470    2.755    3.412    4.267

n = 60  True        1.671    2.000    2.390    2.660    3.232    3.962
        K           1.668    1.996    2.383    2.650    3.218    3.953
        PP          1.671*   2.001*   2.391*   2.661*   3.232*   3.963*
        GK(I)       1.668    1.996    2.383    2.650    3.218    3.953
        GK(II)      1.668    1.996    2.383    2.650    3.218    3.953
        GK(III)     1.680    2.013    2.401    2.665    3.255    3.989

* indicates closest approximation to true value